MCIC Wooster, Ohio State University, USA
2024-02-08
Today, we will assemble a bacterial genome from one of the pairs of FASTQ files that you copied yesterday morning:
total 6.1G
-rw-r--r-- 1 jelmer PAS0471 205M Feb 7 11:21 SM04_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 242M Feb 7 11:21 SM04_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 188M Feb 7 11:21 SM1030_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 221M Feb 7 11:21 SM1030_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 187M Feb 7 11:21 SM1031_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 221M Feb 7 11:21 SM1031_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 187M Feb 7 11:21 SM1038_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 224M Feb 7 11:21 SM1038_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 176M Feb 7 11:21 SM1042_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 199M Feb 7 11:21 SM1042_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 172M Feb 7 11:21 SM109_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 198M Feb 7 11:21 SM109_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 157M Feb 7 11:21 SM155_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 181M Feb 7 11:21 SM155_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 155M Feb 7 11:21 SM156_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 185M Feb 7 11:21 SM156_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 146M Feb 7 11:21 SM181_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 159M Feb 7 11:21 SM181_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 195M Feb 7 11:21 SM190_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 241M Feb 7 11:21 SM190_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 192M Feb 7 11:21 SM191_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 234M Feb 7 11:21 SM191_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 194M Feb 7 11:21 SM205_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 242M Feb 7 11:21 SM205_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 176M Feb 7 11:21 SM207_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 210M Feb 7 11:21 SM207_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 137M Feb 7 11:21 SM226_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 165M Feb 7 11:21 SM226_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 192M Feb 7 11:21 SM51_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 224M Feb 7 11:21 SM51_R2.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 193M Feb 7 11:21 SM914_R1.fastq.gz
-rw-r--r-- 1 jelmer PAS0471 226M Feb 7 11:21 SM914_R2.fastq.gz
Best possible resolution for, e.g.:
The only way to:
From Weisberg et al. 2021
From Koonin et al. 2021
From Wick et al. 2023
From Wick et al. 2023
After library prep, each DNA fragment is flanked by several types of short sequences that together make up the “adapters”:
In Illumina sequencing, DNA fragments can be sequenced from both ends as shown below — this is called “paired-end” (PE) sequencing:
When sequencing is instead single-end (SE), no reverse read is produced:
Paired-end sequencing is a way to effectively increase the read length.
The total size of the biological DNA fragment (without adapters) is often called the insert size:
Insert size varies with the aims of the library prep protocol, and also because size selection has limited precision. In some cases, the insert size can be:
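How insert size relates to read length can be made concrete with a small sketch. The function name, the category labels, and the example numbers below are illustrative assumptions, not terms from the slides. It only encodes the basic geometry of a read pair that is sequenced inward from both ends of the insert.

```python
def describe_read_layout(insert_size: int, read_length: int) -> str:
    """Classify how a read pair covers an insert (all lengths in bp).

    Hypothetical helper: assumes both reads have the same length and
    start at opposite ends of the insert.
    """
    if insert_size <= 0 or read_length <= 0:
        raise ValueError("insert_size and read_length must be positive")
    if insert_size < read_length:
        # Each read runs past the end of the insert and into the adapter.
        return "adapter read-through"
    if insert_size < 2 * read_length:
        # The forward and reverse reads cover some of the same bases.
        return "overlapping reads"
    # The reads meet exactly or leave an unsequenced stretch in the middle.
    return "non-overlapping reads"


def unsequenced_gap(insert_size: int, read_length: int) -> int:
    """Number of bases in the middle of the insert that neither read covers."""
    return max(0, insert_size - 2 * read_length)


# Made-up insert sizes, combined with 300 bp reads
for size in (250, 500, 800):
    print(size, describe_read_layout(size, 300), unsequenced_gap(size, 300))
```

With long reads such as 2x300 bp, even moderately sized inserts fall into the overlapping category, which is one reason to check the insert size distribution of a library.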
The different templates within a cluster get out of sync because occasionally:
They miss a base incorporation
They incorporate two bases at once
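To see why these rare slips matter more towards the end of a read, here is a deliberately simplified sketch. It assumes that each template slips independently with a fixed probability per cycle and never gets back in sync. The function name and the per-cycle probabilities are made-up illustrations, not measured Illumina values.

```python
def in_sync_fraction(n_cycles: int, p_lag: float, p_lead: float) -> float:
    """Expected fraction of templates in a cluster that are still in sync.

    p_lag:  assumed per-cycle probability of missing a base incorporation
    p_lead: assumed per-cycle probability of incorporating two bases at once
    Simplification: a template that has slipped once is counted as out of
    sync for the rest of the run.
    """
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    if p_lag < 0 or p_lead < 0 or p_lag + p_lead > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return (1.0 - p_lag - p_lead) ** n_cycles


# Hypothetical slip probabilities of 0.1% each per cycle
for cycle in (1, 50, 150, 300):
    print(cycle, round(in_sync_fraction(cycle, 0.001, 0.001), 3))
```

Under this model the in-sync fraction can only shrink as cycles accumulate, so the signal from a cluster becomes progressively more mixed at later positions in the read.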
This error profile is why, for Illumina:
Pseudomonas syringae causes disease in a wide range of host plants, from Solanaceae and Leguminosae plants to citrus and stone fruit trees.
Pseudomonas syringae pv. syringae (Pss) is an emerging phytopathogen that causes Pseudomonas leaf spot (PLS) disease in pepper plants (Capsicum annuum var. annuum).
Copper-based antimicrobials are used for chemical control of Pss in peppers, and this use has led to the emergence of copper-resistant Pss strains.
From Ranjit et al. in prep, Ohio State University
16 Pss samples were isolated from pepper plants harboring characteristic PLS symptoms in Ohio between 2013 and 2021.
Illumina MiSeq sequencing of these isolates: 2x300 bp reads
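For a rough sense of what 2x300 bp reads mean for sequencing depth, the sketch below estimates average coverage from a number of read pairs. The read-pair count and the genome size of about 6 Mbp are assumptions chosen for illustration, not values from this dataset, and the function name is hypothetical.

```python
def expected_coverage(n_read_pairs: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage, assuming every base of every read is usable.

    Each read pair contributes two reads of `read_length` bases.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return 2 * n_read_pairs * read_length / genome_size


# Assumed values: 1 million read pairs, 300 bp reads, a genome of ~6 Mbp
coverage = expected_coverage(n_read_pairs=1_000_000, read_length=300, genome_size=6_000_000)
print(f"Expected coverage: {coverage:.0f}x")  # prints "Expected coverage: 100x"
```

Real coverage will be somewhat lower once adapters and low-quality bases are trimmed, so this number is best treated as an upper bound.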
From Koonin et al. 2021